Main Analysis
After forming a group based on our common interest in the Food Environment Atlas dataset, each team member performed independent preliminary exploratory data analysis. This involved reading the documentation to learn about the variables and gaining a deeper understanding of the dataset. As there are 211 features in this dataset, it was clear that we needed to focus the scope of our analysis. After our initial investigations, we outlined which variables were particularly interesting to each of us. The general consensus was to focus on food insecurity, and try to narrow in on related factors.
One of our initial hunches was that low access or proximity to stores would be related to food insecurity. The dataset also includes variables related to farmers’ markets, which led to a second hypothesis: families that do not live close to stores may get most of their food from farmers’ markets. As shown later, this was generally not the case.
After we had a better understanding of the direction of our analysis, we compiled a list of 108 variables to analyze. The downloadable dataset is an Excel document with multiple sheets, so we grouped these variables into one data frame for ease of analysis (see code “data_cleaning.R”). Overall, the data in this Atlas was “clean”, so most of our “pre-processing” time went into interpreting variables and evaluating their relationship to our overall objective.
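The merging step can be sketched as follows. This is a minimal illustration in base R, not the contents of “data_cleaning.R”: the two small data frames stand in for Atlas sheets (in practice each would be read from the Excel file), the FIPS codes, names, and values are invented, and the only assumption carried over from the Atlas is that each sheet identifies a county by FIPS, State, and County.

```r
# Toy stand-ins for two Atlas sheets; all codes and values are invented.
access <- data.frame(
  FIPS = c("99001", "99002", "99003"),
  State = c("AA", "AA", "BB"),
  County = c("County A", "County B", "County C"),
  PCT_LACCESS_POP10 = c(30, 20, 10),
  stringsAsFactors = FALSE
)
socio <- data.frame(
  FIPS = c("99001", "99002"),
  State = c("AA", "AA"),
  County = c("County A", "County B"),
  MEDHHINC10 = c(50000, 40000),
  stringsAsFactors = FALSE
)

# Join every sheet on the shared county identifiers.
# all = TRUE keeps counties that are missing from one of the sheets (filled with NA).
keys <- c("FIPS", "State", "County")
sheets <- list(access, socio)
combined <- Reduce(function(x, y) merge(x, y, by = keys, all = TRUE), sheets)

print(combined)
```

Keeping unmatched counties (rather than dropping them) makes gaps visible as `NA`, which is useful when deciding later which states can be compared.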
# Quick data cleaning
# Remove PR - only look at 50 states + DC
fea <- filter(fea, State != "PR")
# For county plotting purposes
fea$region <- as.numeric(fea$FIPS)
fea %>%
select(-region) %>%
group_by(State) %>%
summarise_all(mean) ->
state_data
# Bring in state abbreviation to name mapping
data(state)
state_abbrevs <- tibble(abb = state.abb, region = tolower(state.name))
state_abbrevs <- add_row(state_abbrevs, abb = 'DC', region = 'district of columbia')
state_data <- left_join(state_data, state_abbrevs, by = c('State' = 'abb'))
Food Insecurity in the United States
First, we wanted to get a better sense of food insecurity in the United States. The two main variables we considered were the three-year average percentages of households with food insecurity for 2007-09 and 2010-12. The Atlas reports these values as state averages, not at the county level. We began by looking at how the distribution of food insecurity changed between the two periods. The plot shows the range for each period, with a notable increase in the maximum value from 2007-09 to 2010-12. The median is also higher in the later period, indicating an overall increase in food insecurity among states.
insec_bar <- state_data %>% select(State, FOODINSEC_07_09, FOODINSEC_10_12)
insec_bar_tidy <- insec_bar %>% gather("Year", "Value", -State)
# Density plot of food insecurity, comparing years
med_0709 <- median(filter(insec_bar_tidy, Year == 'FOODINSEC_07_09')$Value / 100)
med_1012 <- median(filter(insec_bar_tidy, Year == 'FOODINSEC_10_12')$Value / 100)
ggplot(insec_bar_tidy) +
geom_density(aes(Value / 100, fill = Year, color = Year), alpha = 0.4, adjust = 0.7) +
scale_x_continuous(labels = scales::percent) +
xlab('Food Insecurity') +
ggtitle('Household Food Insecurity (3-year avg), with US Medians') +
scale_fill_manual(name = 'Year', labels = c('2007-09', '2010-12'), values = c('#66cc66', '#48629d')) +
scale_color_manual(name = 'Year', labels = c('2007-09', '2010-12'), values = c('#66cc66', '#48629d')) +
geom_vline(xintercept = med_0709, color = '#66cc66', alpha = 0.8) +
geom_vline(xintercept = med_1012, color = '#48629d', alpha = 0.8) +
geom_text(aes(x = med_0709, y = -0.6, label = paste(med_0709*100, '%', sep = ''), hjust = 1.1), color = '#66cc66') +
geom_text(aes(x = med_1012, y = -0.6, label = paste(med_1012*100, '%', sep = ''), hjust = -0.1), color = '#48629d') +
theme_bw()
After looking at the overall distribution over the two time frames, we represented the statewide food insecurity data on a US map. States are colored according to the three-year average percentage of households with food insecurity in 2007-09 and 2010-12. Overall, the colors in the 2010-12 map are darker, indicating that a higher percentage of households faced food insecurity. The Southeast in particular stands out as the area where food insecurity is most prevalent.
# Set manual scale for both graphs
# Remove state abbreviation labels
breaks = seq(6, 21, by = 3)
c = StateChoropleth$new(mutate(state_data, value = cut(state_data$FOODINSEC_07_09, breaks = breaks)))
c$title = "Household Food Insecurity (%), 2007-2009"
c$legend = "Food Insecurity %"
c$set_num_colors(length(breaks) - 1)
c$set_zoom(NULL)
c$show_labels = FALSE
c$ggplot_polygon = geom_polygon(aes(fill = value), color = 'gray20')
state_insec_0709 = c$render()
c = StateChoropleth$new(mutate(state_data, value = cut(state_data$FOODINSEC_10_12, breaks = breaks)))
c$title = "Household Food Insecurity (%), 2010-2012"
c$legend = "Food Insecurity %"
c$set_num_colors(length(breaks) - 1)
c$set_zoom(NULL)
c$show_labels = FALSE
c$ggplot_polygon = geom_polygon(aes(fill = value), color = 'gray20')
state_insec_1012 = c$render()
state_insec_0709
state_insec_1012
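Both maps share one color scale because the state values are binned with `cut()` using the same breaks. The short sketch below shows how that binning behaves; the rates are illustrative numbers rather than Atlas values, and `rates` and `bins` are names introduced only for this example.

```r
# Same breaks as the maps: 6, 9, 12, 15, 18, 21
breaks <- seq(6, 21, by = 3)

# Illustrative percentages, not Atlas values
rates <- c(6, 7.1, 14.2, 20.9, 22)
bins <- cut(rates, breaks = breaks)

# Five intervals, i.e. length(breaks) - 1, matching set_num_colors()
print(levels(bins))

# Intervals are open on the left and closed on the right, so a value
# exactly at the lowest break (6) or above the highest break (22) becomes NA
print(as.character(bins))
```

A state whose value falls outside the breaks would therefore be drawn without a fill color, which is worth checking whenever the breaks are set by hand.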
# Cleveland dot plot theme
theme_dotplot1 <- theme_bw(10) +
theme(axis.text.y = element_text(size = rel(.8)),
axis.ticks.y = element_blank(),
axis.title.x = element_text(),
panel.grid.major.x = element_blank(),
panel.grid.major.y = element_line(size = 0.5),
panel.grid.minor.x = element_blank())
Next, looking at the percentage of food insecure households by state, there are four states - Mississippi (MS), Arkansas (AR), Texas (TX), and Alabama (AL) - that seem to be outliers and experience high food insecurity compared to the rest of the country. For later analyses, these four states are used to signify high food insecurity, and we compare them against four states with low food insecurity. According to the Cleveland dot plot below, the four states with the lowest percentage of food insecure homes are North Dakota (ND), Virginia (VA), New Hampshire (NH), and Minnesota (MN). However, from the above missing data analysis, many values for VA are missing. Similarly, the Atlas does not include any SNAP variables for NH. Since our later analysis assesses the effectiveness of national food assistance programs, we are unable to use NH in this analysis. Thus, excluding VA and NH, the four states we consider as exhibiting low food insecurity are ND, MN, Wisconsin (WI), and Massachusetts (MA).
ggplot(insec_bar, aes(x = reorder(State, FOODINSEC_10_12 / 100), y = FOODINSEC_10_12 / 100)) +
scale_y_continuous(labels = scales::percent) +
geom_point() +
theme_dotplot1 +
coord_flip() +
ggtitle("Household Food Insecurity by State (3-year Avg), 2010-2012") +
ylab("Household food insecurity (3-year avg)") + xlab("")
Looking at the change in average household food insecurity between 2007-09 and 2010-12, it is clear that food insecurity in the US worsened over this period. For all states except four, the average percentage of households with food insecurity is greater in the later years. To combat food insecurity and reduce the number of households that struggle with it, it is important to identify related variables.
ggplot(insec_bar_tidy, aes(x = reorder(State, Value), y = Value / 100, color = Year)) +
scale_y_continuous(labels = scales::percent) +
geom_point() +
theme_dotplot1 +
coord_flip() +
ggtitle("Household Food Insecurity by State (3-year Avg), 2007-2009 vs. 2010-2012") +
ylab("Average household food insecurity") +
xlab("") +
scale_color_manual(labels = c('2007-09', '2010-12'), values = c('#66cc66', '#48629d'))
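The statement that only a handful of states did not see an increase can be checked directly from the two columns. The sketch below is one way to do that, under the assumption of a state-level table shaped like `insec_bar`; the table `insec_toy`, its state codes, and its values are invented for illustration.

```r
# Toy table with the same column names as insec_bar; values are invented.
insec_toy <- data.frame(
  State = c("AA", "BB", "CC", "DD"),
  FOODINSEC_07_09 = c(12.0, 15.5, 10.1, 13.0),
  FOODINSEC_10_12 = c(13.4, 15.0, 11.6, 13.0),
  stringsAsFactors = FALSE
)

# Change in percentage points between the two three-year averages
insec_toy$change <- insec_toy$FOODINSEC_10_12 - insec_toy$FOODINSEC_07_09

# States where food insecurity did not rise in the later period
not_higher <- insec_toy$State[insec_toy$change <= 0]
print(not_higher)
```

Applied to the real table, the length of `not_higher` would give the count of exceptions quoted in the text.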
To structure our analysis, we followed the USDA report (USDA, 2016) which outlines factors that “affect food security and access to a healthy diet”. While we are able to assess relationships and observe trends between these factors and food security, it is important to note that we cannot explicitly point out causal relationships.
high_insecurity_states <- c('MS', 'AR', 'TX', 'AL')
low_insecurity_states <- c('ND', 'MN', 'WI', 'MA')
fea_high <- filter(fea, State %in% high_insecurity_states)
fea_high$State <- paste('high', sep = '_', fea_high$State)
fea_low <- filter(fea, State %in% low_insecurity_states)
fea_low$State <- paste('low', sep = '_', fea_low$State)
1. Household income
The USDA reports that household income affects food security. This makes sense - lower incomes may result in inadequate food for the household, in terms of quality, variety, or amount. We used the dataset to make this link between food insecurity and household income, graphically.
We plotted median household income by county to see the household income distribution across the country. A large area of lower median household income is seen in the Southeast and Mideast, where food insecurity is also most prevalent. This comparison is not perfect in terms of dates (household income data is from 2010, food insecurity is a 3-year average of 2010-12), measurement (household income is expressed as the median, food insecurity data is based on surveys), or granularity (household income is at the county level, food insecurity is a state average). Even so, we can see that the regions with a higher percentage of food insecure households also have counties with relatively lower median household incomes. Texas is one of the four states with the greatest food insecurity, on average, but although some Texas counties have low median household income, others are considerably richer. This is one of the issues with the dataset providing only state average values of food insecurity: we cannot dig deeper into county-level food insecurity values, or even account for the standard error of the state averages.
continent_states <- unique(state_data$region)
continent_states <- continent_states[continent_states != 'alaska' & continent_states != 'hawaii']
blue_color_scale <- c('#084594', '#2171b5', '#4292c6', '#6baed6', '#9ecae1', '#c6dbef', '#eff3ff')
breaks = c(20, 33, 36, 39, 42, 45, 51, 120)
county_choropleth(mutate(fea, value = cut(MEDHHINC10/1000, breaks = breaks)), title = 'Median Household Income by County (2010)', state_zoom = continent_states) +
scale_fill_manual(values = blue_color_scale, name = 'Income ($1000s)')
# Median HH income vs food insecurity
ggplot(fea, aes(x = MEDHHINC10, y = FOODINSEC_10_12 / 100)) + geom_point(alpha = 0.4) +
scale_x_continuous(labels = scales::dollar) +
scale_y_continuous(labels = scales::percent) +
xlab('Median Household Income') +
ylab('Average Food Insecurity') +
ggtitle('State Average Household Food Insecurity (2010-12) \n vs County Median Household Income (2010), with US County Median') +
geom_vline(xintercept = median(fea$MEDHHINC10, na.rm = TRUE), linetype = 'longdash')
Note: The median household income denoted by the dashed line is the median of county median values, our proxy for the median for the US as a whole.
As median household income varies significantly by state, we wanted to compare the distributions of household income for the states with the highest and lowest percentages of food insecurity, against the household income distribution of the US as a whole. Here we see that county median household incomes for highly food insecure states are lower than the US as a whole, while low food insecurity states have incomes distributed higher than the US in general.
ggplot() +
geom_density(data = fea_high, aes(MEDHHINC10, color = State, fill = State), alpha = 0.1) +
geom_density(data = fea, aes(MEDHHINC10, color = 'US', fill = 'US'), alpha = 0.1) +
xlab('Median Household Income (County)') +
ggtitle('Median Household Income (County): \n Highest Food Insecurity States & US (2010)') +
scale_x_continuous(labels = scales::dollar) +
theme_bw()
ggplot() +
geom_density(data = fea_low, aes(MEDHHINC10, color = State, fill = State), alpha = 0.1) +
geom_density(data = fea, aes(MEDHHINC10, color = 'US', fill = 'US'), alpha = 0.1) +
xlab('Median Household Income (County)') +
ggtitle('Median Household Income (County): \n Lowest Food Insecurity States & US (2010)') +
scale_x_continuous(labels = scales::dollar) +
theme_bw()
The Atlas includes county-level variables related to poverty, so we wanted to look at the relationship between poverty and median household income. From the scatterplot below, it is clear that there is a strong inverse relationship between these two variables, and that they could be used as proxies for each other. This makes sense, as household income and family size are the two main factors used in measuring poverty. It is important to note, however, that food insecurity is not equivalent to poverty. As defined above, food insecurity describes a household as lacking adequate food in terms of quality and variety, and in extreme cases, quantity. Therefore, while household income, and thus poverty, have a clear link to food insecurity, they differ in their definitions.
# We can use poverty and median hh income as proxies for each other - strong inverse relationship
ggplot(fea, aes(x = MEDHHINC10, y = POVRATE10 / 100)) + geom_point(alpha = 0.4) +
scale_x_continuous(labels = scales::dollar) +
scale_y_continuous(labels = scales::percent) +
xlab('Median Household Income') +
ylab('Poverty Rate') +
ggtitle('Poverty Rate vs Median Household Income (2010), by County')
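The strength of the inverse relationship seen in the scatterplot can also be summarized with a rank correlation. The following is a minimal sketch, assuming a county-level data frame with the Atlas columns `MEDHHINC10` and `POVRATE10`; the data frame `toy` and its values are invented, and Spearman's coefficient is our choice here rather than something used in the original analysis.

```r
# Invented county-level values using the Atlas column names.
toy <- data.frame(
  MEDHHINC10 = c(28000, 35000, 41000, 52000, 67000, NA),
  POVRATE10 = c(29.5, 22.1, 17.8, 12.4, 7.9, 15.0)
)

# Spearman's rho only assumes a monotone relationship, not a linear one.
# use = "complete.obs" drops counties with a missing value in either column.
rho <- cor(toy$MEDHHINC10, toy$POVRATE10,
           method = "spearman", use = "complete.obs")
print(rho)
```

A value close to -1 on the real data would support treating the two variables as rough proxies for each other.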
2. Location
According to the USDA, a household’s location relative to a supermarket or grocery store affects food security. One of our initial hypotheses was that food insecurity was related to low access to food.
To assess the relationship between access to food and food security, we used the variable “low access to store” as a proxy for location. The dataset defines low access to store (%) as the “percentage of people in a county living more than 1 mile from a supermarket, supercenter or large grocery store if in an urban area, or more than 10 miles from a supermarket or large grocery store if in a rural area”. Is low store access more of an issue in counties with high food insecurity?
To answer this question, we looked at the distributions of low access to stores within each state, which is reported at the county level. We compared the four highest and lowest food insecure states, and found that low access to stores does not appear to be associated with food insecurity. In fact, at least half of the counties in AL, AR and MS have low store access values below the US county median, while a higher proportion of people in MA, MN and ND, where food insecurity is low, live far from stores. This suggests that physical proximity to stores is not a main factor related to food insecurity. Given that the definition of food insecurity relates to food quality and variety, it is not surprising that proximity to stores does not have a clear relation to it. Looking forward, it would be worthwhile to seek out additional data sources to assess the USDA’s claim that location is a main factor affecting food security.
ggplot() +
geom_boxplot(data = fea_high, aes(x = State, y = PCT_LACCESS_POP10 / 100), fill = 'lightpink') +
geom_boxplot(data = fea_low, aes(x = State, y = PCT_LACCESS_POP10 / 100), fill = 'lightblue') +
geom_boxplot(data = fea, aes(x = 'US', y = PCT_LACCESS_POP10 / 100), fill = 'gray90') +
scale_y_continuous(labels = scales::percent) +
ylab('Low Access to Store (County)') +
ggtitle('Low Access to Store (County), with US Median (2010)') +
geom_hline(yintercept = median(fea$PCT_LACCESS_POP10 / 100, na.rm = TRUE), linetype = 'longdash') +
xlim(c(unique(fea_high$State), 'US', unique(fea_low$State))) +
theme_bw()
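The comparison against the US county median read off the boxplots can also be expressed as a number per state: the share of counties whose low-access value falls below the national county median. The sketch below assumes a county-level table with `State` and `PCT_LACCESS_POP10`; the table `toy`, its state labels, and its values are invented for illustration.

```r
# Invented county-level values; state labels are hypothetical.
toy <- data.frame(
  State = c("high_AA", "high_AA", "high_AA",
            "low_BB", "low_BB", "low_BB", "CC", "CC"),
  PCT_LACCESS_POP10 = c(10, 15, 30, 20, 35, 40, 25, NA),
  stringsAsFactors = FALSE
)

# Median over all counties, ignoring missing values
us_median <- median(toy$PCT_LACCESS_POP10, na.rm = TRUE)

# Share of each state's counties strictly below the US county median
share_below <- tapply(toy$PCT_LACCESS_POP10 < us_median,
                      toy$State, mean, na.rm = TRUE)
print(share_below)
```

A share of at least 0.5 for a state corresponds to the observation that its median county sits at or below the dashed US line.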
Another initial hypothesis of ours was that households in states with high food insecurity may be located farther from grocery stores, but may instead shop more at farmers’ markets. Here we plotted the distributions of store types against each state’s food insecurity percentage. Across the board, grocery stores dominate in numbers, and farmers’ markets are relatively scarce at all food insecurity levels. Note that while a stacked bar chart is generally not recommended, this exploratory chart is effective in showing how small a share of all food stores farmers’ markets represent.
# Farm analysis
store_farm_09 <- fea %>% select(FIPS, State, County, GROCPTH07, SUPERCPTH07, FMRKTPTH09, FOODINSEC_07_09)
store_farm_09_tidy <- store_farm_09 %>% gather("StoreType", "ValuePer1000", -FIPS, -State, -County,-FOODINSEC_07_09)
##By food insecurity level, store types (07-09)
ggplot(store_farm_09_tidy, aes(x = FOODINSEC_07_09 / 100, y = ValuePer1000, color = StoreType, fill = StoreType)) +
geom_col() +
scale_x_continuous(labels = scales::percent) +
ggtitle('Number of Stores per 1,000 People') +
xlab('State Food Insecurity Rate (2007-09)') +
ylab('Stores per 1,000 people') +
scale_color_manual(name = 'Store Type', labels = c('Farmers\' markets (2009)', 'Grocery stores (2007)', 'Supercenters & \n club stores (2007)'), values = c('#e6ac00', '#66cc66', '#48629d')) +
scale_fill_manual(name = 'Store Type', labels = c('Farmers\' markets (2009)', 'Grocery stores (2007)', 'Supercenters & \n club stores (2007)'), values = c('#e6ac00', '#66cc66', '#48629d'))
3. Prices & Taxes
The USDA reports that food prices and taxes affect food security. We broke down this category into food prices and food taxes.
Prices
The only food price variables in this dataset compare the regional prices of low-fat milk and soda to the national average, and to each other. As soda is an unhealthy food choice, we did not want to bring this variable into the story of food insecurity. Instead, we compared the price of low-fat milk to the national average, for the high and low food insecurity states. In the states with high food insecurity, regional milk prices tend to be higher than the US median (seen from the boxplot). WI, MN and ND, states with low food insecurity, tend to have milk prices below the US median, whereas MA, which also has low food insecurity, has higher than US-median milk prices. There does not seem to be a very clear relationship between food insecurity and the price of milk compared to the national average. That being said, this variable is somewhat unclear and may not be a good estimator of a state’s or county’s food prices.
ggplot() +
geom_point(data = fea_high, aes(x = State, y = MILK_PRICE10)) +
geom_point(data = fea_low, aes(x = State, y = MILK_PRICE10)) +
geom_boxplot(data = fea, aes(x = 'US', y = MILK_PRICE10), fill = 'gray90') +
ylab('Price of low-fat milk / national average') +
ggtitle('Regional Price of Low-Fat Milk / National Average (2010)') +
geom_rect(aes(xmin = 0.5, xmax = 4.5, ymin = -Inf, ymax = Inf), alpha = 0.1, fill = "red") +
geom_rect(aes(xmin = 5.5, xmax = 9.5, ymin = -Inf, ymax = Inf), alpha = 0.2, fill = "cyan3") +
geom_hline(yintercept = 1, alpha = 0.3) +
xlim(c(unique(fea_high$State), 'US', unique(fea_low$State))) +
theme_bw()
Note: The plot above displays the ratio of low-fat milk prices to the national average by region, which is why the median value does not lie at 1.
Taxes
According to the USDA, food taxes are a main factor affecting household food security. The Food Environment Atlas contains a variable for the general food sales tax by state, so we used this to see if there was any connection with food insecurity.
In mapping the general food sales tax, we see that most states do not levy any food sales tax. However, what stands out is that three of the four states with the greatest food insecurity - AR, MS, and AL - all levy a food sales tax. We wanted to investigate this further, as it is worrisome that households in states where food insecurity is most prevalent are taxed on groceries.
Research clarified that while 45 states and DC impose a general sales tax, most of these states have eliminated, reduced, or offset tax on food for consumption. Six states (UT, MO, AR, IL, VA, TN) tax groceries at a lower rate, and four states (ID, KS, OK, HI) provide rebates for grocery taxes (Figueroa & Waxman, 2017). The three states that apply the full sales tax to food purchased for home consumption are Alabama, Mississippi, and South Dakota. According to the Atlas, Mississippi is the state with the largest percentage of food insecure households, and Alabama is the fourth (based on the state average for 2010-12).
Taxes on groceries are a regressive tax: they have a disproportionate impact on poorer households, which spend a larger percentage of their income on food. It is quite concerning that two states, Mississippi and Alabama, with lower median household incomes compared to the rest of the US, levy the full sales tax rate on groceries. Povich (2016) reiterates that taxing groceries disproportionately hurts the poor, and may affect the quality, variety, and amount of food they can afford - the exact definition of food insecurity. While lower income families spend less on groceries compared to higher income families, what they spend is a larger share of their income: according to Povich (2016), the lowest income Americans spend 34.1% of their income on groceries, compared to 13.4% for middle-income Americans.
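The regressive effect can be made concrete with a short calculation. This is a back-of-the-envelope sketch: the grocery shares of income are the figures quoted from Povich (2016), while the 4% tax rate is a hypothetical value chosen for illustration, and the calculation assumes the tax applies to all grocery spending with no exemptions or rebates.

```r
# Share of income spent on groceries, as quoted from Povich (2016)
grocery_share <- c(lowest_income = 0.341, middle_income = 0.134)

# Hypothetical grocery sales tax rate, for illustration only
tax_rate <- 0.04

# Tax paid on groceries as a share of income
burden <- grocery_share * tax_rate
print(round(100 * burden, 2))  # in percent of income

# How many times heavier the burden is for the lowest-income group
ratio <- burden[["lowest_income"]] / burden[["middle_income"]]
print(round(ratio, 2))
```

Because the rate cancels out of the ratio, the relative burden depends only on the two grocery shares, whatever rate a state actually charges.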
Taxes are a reliable and steady revenue source, especially in volatile times; making up for the revenue lost by removing the tax on groceries, which represent 6-7% of consumption, would mean increasing the general sales tax by a full percentage point (Povich, 2016). One Alabama state senator tried and failed to pass legislation to phase out the state’s grocery tax and replace it with a one percent increase in the overall sales tax.
Although food insecurity in Kansas (an average of 14.4% for 2010-12) is around the country’s average (14.2% for the same period), there is an interesting story related to food sales tax in Kansas. While Kansas does offer credits or rebates to offset some of the tax its residents pay for groceries, it still charges the full sales tax on food for consumption in the home. Additionally, local county and city governments can impose their own taxes, so some residents are taxed as much as 10.5% on groceries (Wong, 2016). Interestingly, Wong (2016) identifies that 35 of Kansas’s 105 counties share a border with a neighboring state where there is either no tax on groceries, or the tax is at a reduced rate. Naturally, this leads to revenue lost by the Kansas state government. Moreover, low-income households are less likely to have the means to grocery shop in lower-taxed areas, and it is these households that bear more of the food sales tax burden since it is “difficult for them to avoid” (Wong, 2016).
In summary, while most states have made adjustments or eliminated sales tax on groceries, some states have not followed suit. Some of these grocery-taxing states are also those with lower household incomes and higher rates of food insecurity.
# Define buckets of sales tax
breaks <- c(-1, 0, 2, 4, 6, 8)
state_data %>%
select(region, FOOD_TAX11) %>%
mutate(value = cut(FOOD_TAX11, breaks = breaks)) ->
plot_df
# Change first level to 0
levels(plot_df$value)[levels(plot_df$value) == '(-1,0]'] <- '0'
state_choropleth(plot_df, legend = 'Sales Tax %', title = "General Food Sales Tax (2011)")
4. Transportation
The USDA lists transportation as a factor affecting household food security. Unfortunately the Food Environment Atlas lacks details related to transportation. Moreover, the Atlas presents food insecurity data as a 3-year state average, whereas the USDA report looks at household-level food security. We were unable to assess this USDA claim, as there was no variable in the data set we could use as a proxy for household transportation. One hypothesis we thought of is that lower income households, or households in poverty, are less likely to have means of transportation that more prosperous households have access to. Thus, the link between transportation and food security may be an indicator of household wealth. This would be an interesting analysis to perform in the future.
5. Food And Nutrition Assistance Programs
Within the Food Environment Atlas, data was provided for two major Food and Nutrition Assistance Programs in the US, the Supplemental Nutrition Assistance Program (SNAP) and the Special Supplemental Nutrition Program for Women, Infants and Children (WIC). The USDA considers food and nutrition assistance programs a main factor affecting household food security, so we analyzed these Atlas variables in hopes of observing a similar relationship.
Since we were not well acquainted with either program, we sought out additional information to better understand the data. In particular, we were interested in learning how long these programs have existed, details related to eligibility criteria, the method of reimbursement, and changes to these programs in the years of the recent recession (i.e., 2008-2013).
SNAP: Supplemental Nutrition Assistance Program
The Supplemental Nutrition Assistance Program (SNAP), formerly known as the Food Stamp Program, provides food-purchasing assistance for low- and no-income people living in the US. SNAP is the largest program in the domestic hunger safety net. The Food and Nutrition Service (FNS) works with state agencies, nutrition educators, and neighborhood and faith-based organizations to ensure that those eligible for nutrition assistance can make informed decisions about applying for the program and can access benefits (Supplemental Nutrition Assistance Program (SNAP), USDA, n.d.).
To obtain SNAP benefits, households must meet certain criteria based on resources, income, deductions, employment requirements, special rules for elderly or disabled, and immigrant eligibility. All of the above tests are in the form of a questionnaire, and if needed, households receive help in filling these forms out. Once the questionnaire is answered, no third party proof is required.
Until the late 1990s, the program issued paper “stamps” or coupons in fixed denominations, to be torn out individually and used in a single exchange. In the late 1990s, the Food Stamp Program was revamped, and a new debit card system, Electronic Benefits Transfer (EBT), was brought in. By June 2004, all states had phased out the coupon system, and in 2008 the program was renamed SNAP (Supplemental Nutrition Assistance Program (SNAP), USDA, n.d.).
During the Recession:
It is important to note that SNAP benefits temporarily increased with the passage of the American Recovery and Reinvestment Act of 2009 (ARRA), a federal stimulus package to help Americans affected by the Great Recession (Plumer, 2013). Beginning in April 2009 and continuing through the expansion’s expiration on November 1, 2013, the ARRA appropriated $45.2 billion to increase monthly benefit levels to an average of $133. This amounted to a 13.6 percent funding increase for SNAP recipients. Because the Atlas includes SNAP data from this period, year-to-year comparisons should be interpreted with care: part of any change may reflect the temporary increase in benefits.
WIC: (Special Supplemental Nutrition Program for) Women, Infants and Children
The Special Supplemental Nutrition Program for Women, Infants and Children (WIC) is a federal assistance program provided by the USDA’s Food and Nutrition Service (FNS) for healthcare and nutrition of low-income pregnant women, breastfeeding women, and infants and children under the age of five. Its mission is to be a partner with other services that are key to childhood and family well-being.
Applicants to the WIC program must meet eligibility requirements in four areas: categorical (Women/Infants/Children), income, residential, and nutrition risk. The fourth eligibility criterion requires a nutritional risk assessment by a qualified health professional.
During the Recession:
The WIC program is primarily funded through two separate federal grants: the food grant, and the nutrition services and administration (NSA) grant. Total funding increased between 2009–2011, then began to decrease in 2012 (FNS, USDA, Funding and Program Data, n.d.).
Since 2008, WIC has seen spending rise and fall. From 2008 to 2011, the total amount spent on programs rose from nearly $6.2 billion to roughly $7.2 billion. In 2012, the amount spent fell to about $6.8 billion, possibly due to the decreasing number of participants (FNS, USDA, WIC Participation and Cost, n.d.).
Since 2008, WIC has also seen a significant decrease in participation, whereas SNAP participation continues to increase nationwide (Prah, 2012). The media has explained this trend with several reasons, including:
- A decline in the overall US birth rate
- Food stamps (from SNAP), now available in the form of unobtrusive debit-like cards, are easier to acquire and use
- WIC applicants have different and additional hurdles to clear before they can receive benefits
- When the recession hit, there was a coordinated government effort to sign people up for SNAP, but the same did not happen with WIC
- With the 2009 increases mentioned above, SNAP benefits were more generous than WIC
With this better understanding of SNAP and WIC, we explored the data to see if it shares the same story (for the recession years as we have highlighted above).
# selecting state variables for SNAP
df_SNAP_statevar <- select(fea, FIPS, State, County, PCH_SNAP_09_14, SNAP_PART_RATE08, SNAP_PART_RATE10)
# selecting state variables for WIC
df_WIC_statevar <- select(fea, FIPS, State, County, PCH_WIC_09_14)
# Removing rows(aka counties) which have NA values, since we later want to group by States and be left with single row per state
df_SNAP_statevar <- df_SNAP_statevar[complete.cases(df_SNAP_statevar), ]
df_WIC_statevar <- df_WIC_statevar[complete.cases(df_WIC_statevar), ]
# Finally creating the SNAP and WIC state dataframe. Note, it doesn't matter what we summarize on, Mean/Median/Min/Max; all counties, within the state, have the same value
df_SNAP_statevar %>%
group_by(State) %>%
summarise_all(min) ->
df_SNAP_state
df_WIC_statevar %>%
group_by(State) %>%
summarise_all(min) ->
df_WIC_state
df_SNAP_WIC_state <- left_join(select(df_SNAP_state, State, PCH_SNAP_09_14), select(df_WIC_state, State, PCH_WIC_09_14), by = 'State')
df_SNAP_WIC_state <- mutate(df_SNAP_WIC_state, SNAP_dup = PCH_SNAP_09_14)
df_SNAP_WIC_state <- df_SNAP_WIC_state %>% gather("Program", "Value", -State, -SNAP_dup)
ggplot(df_SNAP_WIC_state, aes(x = reorder(State, SNAP_dup), y = Value / 100, fill = Program)) +
geom_col(position = 'dodge') +
scale_y_continuous(labels = scales::percent) +
geom_hline(yintercept = 0) +
coord_flip() +
ylab("Change in Participation") +
xlab("") +
ggtitle("Percent Change in SNAP and WIC participation from 2009 to 2014") +
scale_fill_manual(labels = c('SNAP', 'WIC'), values = c('#f1a340', '#998ec3')) +
theme_bw()
The above graph confirms numerous USDA and media reports that SNAP participation increased from 2009 to 2014. Nevada was the only state that saw a decrease in participation over these years. In contrast, the graph depicts a decrease in WIC participation across all states except New York and Minnesota.
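The state-level summaries behind this chart take the minimum over counties, on the assumption that every county in a state carries the same value. A minimal base R sketch of how that assumption could be checked is shown here; the data frame `toy` is invented (two hypothetical states, with one WIC value deliberately altered so the check has something to flag), and `n_distinct_by_state` is a helper defined only for this illustration.

```r
# Invented county rows; one WIC value in state BB is altered on purpose.
toy <- data.frame(
  State = c("AA", "AA", "BB", "BB", "BB"),
  PCH_SNAP_09_14 = c(31.2, 31.2, 18.7, 18.7, 18.7),
  PCH_WIC_09_14 = c(-9.4, -9.4, -3.1, -3.1, -2.0),
  stringsAsFactors = FALSE
)

# Number of distinct non-missing values of x within each group g
n_distinct_by_state <- function(x, g) {
  tapply(x, g, function(v) length(unique(v[!is.na(v)])))
}

# TRUE for a column if any state has more than one distinct value
state_vars <- c("PCH_SNAP_09_14", "PCH_WIC_09_14")
varies <- sapply(toy[state_vars], function(col) {
  any(n_distinct_by_state(col, toy$State) > 1)
})
print(varies)
```

If a column is flagged on the real data, the choice between mean, median, min, and max would start to matter and should be made explicitly.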
Now we will dive deeper into the previous findings by analyzing the SNAP and WIC redemptions in authorized stores in 2008 and 2012. We compared this data for four states with high food insecurity (MS, AL, AR, TX) and four states with low food insecurity (ND, MN, WI, MA). (Note: these are the same states used in the preceding household income, location, and food prices analysis.)
# Take a closer look at SNAP and WIC redemptions in authorized stores in 2008 and 2012, for four states with high food insecurity and four states with low food insecurity.
# Four Target States with highest food insecurity[according to USDA 2013-2014]: Mississippi,Arkansas, Texas, Alabama
# Four Target states with lowest food insecurity[according to USDA 2013-2014]: North Dakota,Minnesota, Wisconsin, Massachusetts
#selecting county level variables for SNAP
df_SNAP_countyvar <- select(fea, FIPS, State, County, SNAPSPTH08, SNAPSPTH12, PCH_SNAPSPTH_08_12, REDEMP_SNAPS08, REDEMP_SNAPS12, PCH_REDEMP_SNAPS_08_12, PC_SNAPBEN08, PC_SNAPBEN10, PCH_PC_SNAPBEN_08_10)
#Selecting county level variables for WIC
df_WIC_countyvar <- select(fea, FIPS, State, County, PC_WIC_REDEMP08, PC_WIC_REDEMP12, REDEMP_WICS08, REDEMP_WICS12)
#removing counties for which SNAP information is not available
df_SNAP_countyvar <- df_SNAP_countyvar[complete.cases(df_SNAP_countyvar), ]
#removing counties for which WIC information is not available
df_WIC_countyvar <- df_WIC_countyvar[complete.cases(df_WIC_countyvar), ]
# keeping counties for only target states
df_SNAP_targetcounty <- filter(df_SNAP_countyvar, State %in% c(high_insecurity_states, low_insecurity_states))
df_WIC_targetcounty <- filter(df_WIC_countyvar, State %in% c(high_insecurity_states, low_insecurity_states))
#Adding new columns to contain redemption values in 1k scale
df_SNAP_targetcounty <- df_SNAP_targetcounty %>%
mutate(Redemp08_1K = REDEMP_SNAPS08 / 1000.00, Redemp12_1K = REDEMP_SNAPS12 / 1000.00)
df_WIC_targetcounty <- df_WIC_targetcounty %>%
mutate(Redemp08_1K = REDEMP_WICS08 / 1000.00, Redemp12_1K = REDEMP_WICS12 / 1000.00)
g1 <- ggplot(df_SNAP_targetcounty, aes(x = reorder(State, +Redemp08_1K), y = Redemp08_1K)) +
geom_boxplot(aes(fill = ifelse(State %in% low_insecurity_states, 'lightblue', 'lightpink'))) +
scale_fill_manual(values = c('lightblue', 'lightpink'), labels = c('Low', 'High'), guide = 'none') +
xlab("State") +
ylab("Average Redemption (in $1k)") +
ggtitle("Average Monthly SNAP Redemptions in Authorized Stores (2008), \nwith US County Median") +
scale_y_continuous(limits = c(0, 600), labels = scales::dollar) +
geom_hline(yintercept = median(df_SNAP_countyvar$REDEMP_SNAPS08 / 1000, na.rm = TRUE), linetype = 'longdash') +
theme_bw()
g3 <- ggplot(df_SNAP_targetcounty, aes(x = reorder(State, +Redemp12_1K), y = Redemp12_1K)) +
geom_boxplot(aes(fill = ifelse(State %in% low_insecurity_states, 'lightblue', 'lightpink'))) +
scale_fill_manual(values = c('lightblue', 'lightpink'), labels = c('Low', 'High'), guide = 'none') +
xlab("State") +
ylab("Average Redemption (in $1k)") +
ggtitle("Average Monthly SNAP Redemptions in Authorized Stores (2012), \nwith US County Median") +
scale_y_continuous(limits = c(0, 600), labels = scales::dollar) +
geom_hline(yintercept = median(df_SNAP_countyvar$REDEMP_SNAPS12 / 1000, na.rm = TRUE), linetype = 'longdash') +
theme_bw()
g2 <- ggplot(df_WIC_targetcounty, aes(x = reorder(State, +Redemp08_1K), y = Redemp08_1K)) +
geom_boxplot(aes(fill = ifelse(State %in% low_insecurity_states, 'lightblue', 'lightpink'))) +
scale_fill_manual(name = 'Food Insecurity\nCategory', values = c('lightblue', 'lightpink'), labels = c('Low', 'High')) +
xlab("State") +
ylab("Average Redemption (in $1k)") +
ggtitle("Average Monthly WIC Redemptions in Authorized Stores (2008), \nwith US County Median") +
scale_y_continuous(limits = c(0, 600), labels = scales::dollar) +
geom_hline(yintercept = median(df_WIC_countyvar$REDEMP_WICS08 / 1000, na.rm = TRUE), linetype = 'longdash') +
theme_bw()
g4 <- ggplot(df_WIC_targetcounty, aes(x = reorder(State, +Redemp12_1K), y = Redemp12_1K)) +
geom_boxplot(aes(fill = ifelse(State %in% low_insecurity_states, 'lightblue', 'lightpink'))) +
scale_fill_manual(name = 'Food Insecurity\nCategory', values = c('lightblue', 'lightpink'), labels = c('Low', 'High')) +
xlab("State") +
ylab("Average Redemption (in $1k)") +
ggtitle("Average Monthly WIC Redemptions in Authorized Stores (2012), \nwith US County Median") +
scale_y_continuous(limits = c(0, 600), labels = scales::dollar) +
geom_hline(yintercept = median(df_WIC_countyvar$REDEMP_WICS12 / 1000, na.rm = TRUE), linetype = 'longdash') +
theme_bw()
grid.arrange(g1, g2, g3, g4, ncol = 2)
The following key observations can be made from the above graphs:
- Average monthly SNAP redemptions increased across all 8 states from 2008 to 2012. The US median also increased from $185.14 to $265.84.
- There is no clear pattern for WIC redemptions. The US median remained consistent, and the state medians either declined or remained the same. Unlike SNAP, none of these states saw an increase in average monthly redemptions.
- In both 2008 and 2012, the US median for average monthly SNAP redemptions was notably higher than the average monthly WIC redemptions.
- As one would expect, average redemptions were generally higher in states with higher food insecurity.
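The state-level comparisons in these observations can be backed by a small table of medians instead of being read from the boxplots. Below is a rough sketch in base R, assuming county rows with the `Redemp08_1K` and `Redemp12_1K` columns created earlier. The data frame `toy_redemp` and its values are hypothetical.

```r
# Toy county-level rows; state codes and redemption values are invented.
toy_redemp <- data.frame(
  State = c("AA", "AA", "AA", "BB", "BB", "BB"),
  Redemp08_1K = c(100, 200, 300, 50, 60, 70),
  Redemp12_1K = c(150, 260, 400, 55, 90, 80)
)

# Median county redemption per state for each year
state_medians <- aggregate(
  cbind(Redemp08_1K, Redemp12_1K) ~ State,
  data = toy_redemp,
  FUN = median
)

# Change in the state median between the two years (in $1k)
state_medians$Change_1K <- state_medians$Redemp12_1K - state_medians$Redemp08_1K

print(state_medians)
```

A positive `Change_1K` for every state would correspond to the first observation, and the same table built from the WIC data frame would show whether any state median rose.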
Since SNAP seems to be the popular choice of food assistance program, especially in recent years, we decided to explore some of the other state and county variables associated with this program.
Earlier we saw how SNAP participation increased across all US states (except Nevada) from 2009 to 2014. Now, we will consider SNAP participation as a percentage of eligible participants in the years 2008 and 2010. This should give clearer evidence of whether participation among the eligible population increased.
df_SNAP_state %>%
select(State, SNAP_PART_RATE08, SNAP_PART_RATE10) %>%
gather("Year", "Value", -State) ->
SNAP_per_tidy
med_08 <- median(filter(SNAP_per_tidy, Year == 'SNAP_PART_RATE08')$Value / 100)
med_10 <- median(filter(SNAP_per_tidy, Year == 'SNAP_PART_RATE10')$Value / 100)
ggplot(SNAP_per_tidy, aes(x = reorder(State, Value), y = Value/100, color = Year)) +
scale_y_continuous(labels = scales::percent) +
geom_point() +
theme_dotplot1 +
coord_flip() +
ggtitle("SNAP Participation as a % of Eligible Population, 2008 vs 2010") +
ylab("SNAP participants (% eligible pop)") + xlab("") +
scale_color_manual(labels = c('2008', '2010'), values = c('#66cc66', '#48629d'))+
geom_hline(yintercept = med_08, color = '#66cc66', alpha = 0.8) +
geom_hline(yintercept = med_10, color = '#48629d', alpha = 0.8) +
# annotate() draws each label once, instead of once per data row as geom_text(aes(...)) would
annotate('text', x = 1.65, y = med_08, label = paste(med_08*100, '%', sep = ''), hjust = 1.1, color = '#66cc66') +
annotate('text', x = 1.75, y = med_10, label = paste(med_10*100, '%', sep = ''), hjust = -0.1, color = '#48629d')
The following key observations can be made from the above graph:
- Median SNAP participation as a percentage of the eligible population increased from 66% in 2008 to 78% in 2010.
- For the majority of states, including Nevada, participation as a percentage of the eligible population increased from 2008 to 2010.
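The "majority of states" statement can be checked directly by differencing the two rate columns. This is a minimal sketch in base R, assuming one row per state with the `SNAP_PART_RATE08` and `SNAP_PART_RATE10` columns. The data frame `toy_rates` and its numbers are invented and do not reproduce the Atlas values.

```r
# Toy state-level participation rates (% of eligible population); values are made up.
toy_rates <- data.frame(
  State = c("AA", "BB", "CC", "DD", "EE"),
  SNAP_PART_RATE08 = c(60, 70, 55, 80, 66),
  SNAP_PART_RATE10 = c(72, 75, 54, 88, 78),
  stringsAsFactors = FALSE
)

# Percentage-point change per state
toy_rates$Change <- toy_rates$SNAP_PART_RATE10 - toy_rates$SNAP_PART_RATE08

# How many states, and what share of states, saw an increase
n_increased <- sum(toy_rates$Change > 0)
share_increased <- mean(toy_rates$Change > 0)

# Difference between the two yearly medians, in percentage points
med_change <- median(toy_rates$SNAP_PART_RATE10) - median(toy_rates$SNAP_PART_RATE08)

print(toy_rates[order(toy_rates$Change), ])
```

Sorting by `Change` also makes it easy to see which states, if any, moved against the overall trend.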
Next, we compared SNAP authorized stores per 1000 people at the county level for the four food secure and four food insecure states.
# County Level Exploration
# Boxplot for SNAP authorized stores/1000 population in 2008 across counties of these 8 states
g1 <- ggplot(df_SNAP_targetcounty, aes(x = reorder(State, +SNAPSPTH08), y = SNAPSPTH08)) +
geom_boxplot(aes(fill = ifelse(State %in% low_insecurity_states, 'lightblue', 'lightpink'))) +
scale_fill_manual(name = 'Food Insecurity\nCategory', values = c('lightblue', 'lightpink'), labels = c('Low', 'High')) +
xlab("State") +
ylab("Stores to Population Ratio") +
ggtitle("SNAP Authorized Stores per 1,000 people (2008), with US County Median") +
ylim(0, 3) +
geom_hline(yintercept = median(df_SNAP_countyvar$SNAPSPTH08, na.rm = TRUE), linetype = 'longdash') +
theme_bw()
# Boxplot for SNAP authorized stores/1000 population in 2012 across counties of these 8 states
g2 <- ggplot(df_SNAP_targetcounty, aes(x = reorder(State, +SNAPSPTH12), y = SNAPSPTH12)) +
geom_boxplot(aes(fill = ifelse(State %in% low_insecurity_states, 'lightblue', 'lightpink'))) +
scale_fill_manual(name = 'Food Insecurity\nCategory', values = c('lightblue', 'lightpink'), labels = c('Low', 'High')) +
xlab("State") +
ggtitle("SNAP Authorized Stores per 1,000 people (2012), with US County Median") +
ylim(0, 3) +
geom_hline(yintercept = median(df_SNAP_countyvar$SNAPSPTH12, na.rm = TRUE), linetype = 'longdash') +
theme_bw() +
# theme() must come after theme_bw(), which would otherwise reset the axis title
theme(axis.title.y = element_blank())
grid.arrange(g1, g2, ncol = 2)
Here we observe that the number of stores per 1,000 people is generally higher in states with higher food insecurity. The ratio increased in nearly all of these states between 2008 and 2012, and the US median rose as well. This is expected given the recession and the response of the Obama administration.
As noted above, during the recession period, the US government increased SNAP benefits in all states, to help combat food insecurity due to unemployment. Below, we look at the SNAP benefits per capita for our chosen 8 states, to see if we can confirm this fact.
high_low_states <- c('mississippi', 'arkansas', 'texas', 'alabama', 'north dakota', 'minnesota', 'wisconsin', 'massachusetts')
high_states <- c('mississippi', 'arkansas', 'texas', 'alabama')
green_color_scale <- c('#edf8fb', '#ccece6', '#99d8c9', '#66c2a4', '#41ae76', '#238b45', '#005824')
county_choropleth(mutate(fea, value = PC_SNAPBEN10), state_zoom = high_low_states, title = 'SNAP Benefits per Capita (2010), Highest and Lowest Food Insecurity States') + scale_fill_manual(values = green_color_scale, na.value = 'gray50', name = 'Benefits per capita ($)')
# Lastly we look at SNAP benefits per capita
g1 <- ggplot(df_SNAP_targetcounty, aes(x = reorder(State, +PC_SNAPBEN08), y = PC_SNAPBEN08)) +
geom_boxplot(aes(fill = ifelse(State %in% low_insecurity_states, 'lightblue', 'lightpink'))) +
scale_fill_manual(name = 'Food Insecurity\nCategory', values = c('lightblue', 'lightpink'), labels = c('Low', 'High')) +
xlab("State") +
ylab("Avg benefits per capita") +
ggtitle("Average Monthly SNAP Benefits Per Capita (2008), with US County Median") +
scale_y_continuous(limits = c(0, 60), labels = scales::dollar) +
geom_hline(yintercept = median(df_SNAP_countyvar$PC_SNAPBEN08, na.rm = TRUE), linetype = 'longdash') +
theme_bw()
g2 <- ggplot(df_SNAP_targetcounty, aes(x = reorder(State, +PC_SNAPBEN10), y = PC_SNAPBEN10)) +
geom_boxplot(aes(fill = ifelse(State %in% low_insecurity_states, 'lightblue', 'lightpink'))) +
scale_fill_manual(name = 'Food Insecurity\nCategory', values = c('lightblue', 'lightpink'), labels = c('Low', 'High')) +
xlab("State") +
ylab("Average monthly SNAP benefits per capita") +
ggtitle("Average Monthly SNAP Benefits Per Capita (2010), with US County Median") +
scale_y_continuous(limits = c(0, 60), labels = scales::dollar) +
geom_hline(yintercept = median(df_SNAP_countyvar$PC_SNAPBEN10, na.rm = TRUE), linetype = 'longdash') +
theme_bw()
grid.arrange(g1, g2, ncol = 2)
The above graphs confirm that SNAP benefits per capita increased across all eight states from 2008 to 2010. The states with higher food insecurity seem to have much higher SNAP benefits per capita than the states with lower food insecurity.
While this was reassuring, we wanted to see how uniform this increase in per capita benefits was within a state where food insecurity is high. In the following analysis, we focused on the counties of Mississippi.
df_benefits_perchange <- select(df_SNAP_targetcounty, FIPS, State, County, PCH_PC_SNAPBEN_08_10)
df_MSbenperchange <- filter(df_benefits_perchange, State == "MS")
df_MSbenperchange <- df_MSbenperchange %>% mutate(Year = 2010)
colnames(df_MSbenperchange)[4] <- "Percentage_Change"
ggplot(df_MSbenperchange, aes(Year, County, fill = Percentage_Change)) +
geom_tile() +
scale_fill_viridis() +
ggtitle("Percent Change in SNAP Benefits Per Capita (2008-2010) in Mississippi Counties") +
ylab("County") +
theme(axis.text.x = element_blank(), axis.ticks.x = element_blank())
This heat map shows that the increase in SNAP benefits per capita was not uniform across counties. A significant number of counties saw less than a 60% increase in the benefit amount. With Mississippi having the highest statewide food insecurity rate, we expected to see more counties with a higher increase in SNAP benefits.
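The phrase "a significant number of counties" can be made concrete by counting the counties below the 60% mark. Here is a small sketch in base R, assuming a county table with a `Percentage_Change` column like the one in `df_MSbenperchange`. The data frame `toy_ms`, its county names, and its values are hypothetical.

```r
# Toy county rows; names and percentage changes are invented for illustration.
toy_ms <- data.frame(
  County = c("County A", "County B", "County C", "County D"),
  Percentage_Change = c(45.0, 72.5, 58.0, 90.1),
  stringsAsFactors = FALSE
)

# Threshold taken from the discussion of the heat map
threshold <- 60

# Flag, count, and compute the share of counties below the threshold
below <- toy_ms$Percentage_Change < threshold
n_below <- sum(below)
share_below <- mean(below)

# Five-number summary gives a quick view of the spread across counties
print(summary(toy_ms$Percentage_Change))
print(toy_ms$County[below])
```

Reporting the count and share alongside the heat map would let readers judge how uneven the increase was without estimating colors by eye.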
In summary, we consider the overarching question: are food assistance programs instrumental in lowering food insecurity?
Unfortunately, we cannot answer this question with the dataset, as no household-level information was available. The state- and county-level data available is for the years 2008-2012, when many were affected by the recession. With factors like a fragile economy and low employment rates in play, determining the effectiveness of SNAP participation in reducing food insecurity is not possible without household-level information.
Also, as stated in the report “Food Insecurity and Hunger in the United States: An Assessment of the Measure” by the National Academies’ Committee on National Statistics, “Prevalence estimates of food insecurity as currently obtained are not well suited for evaluation of the effectiveness of food assistance programs. It is unclear that monitoring the prevalence of food insecurity at national and subnational levels would be suitable for evaluation of these programs” (National Research Council, 2005).